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Abstract 

This paper considers maximum likelihood inference for a functional marked 
point process - the stochastic growth-interaction process - which is an exten- 
sion of the spatio-temporal growth-interaction process to the stochastic mark 
setting. As a pilot study we here consider a particular version of this ex- 
tended process, which has a homogenous Poisson process as unmarked point 
process and shifted independent Cox-IngersoU-Ross processes as functional 
marks. These marks have supports determined by the lifetimes generated by 
an immigration-death process. By considering a (temporally) discrete sample 
scheme for the marks and by considering the process' alternative evolutionary 
representation as a multivariate diffusion (Markovian) with jumps, the likeli- 
hood function is expressed as a product of the process' closed form transition 
densities. Additionally, under the assumption that the mark processes are 
started in their common stationary distribution, and under some restrictions 
on the underlying parameters, consistency and asymptotic normality of the 
maximum likelihood (ML) estimators are proved. The ML-estimators derived 
from the stationarity assumption are then compared numerically to the ML- 
estimators derived under non-stationarity, in order to investigate the robustness 
of the stationarity assumption. To illustrate the model's use in forestry, it is 
fitted to a data set of Scots pines. 

Keywords: Asymptotic normality, Consistency, Cox-Ingersoll-Ross process. Func- 
tional marked point process. Immigration- death process. Maximum likelihood. 
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1 Introduction 



Renshaw and Sarkka presented their spatio-temporal growth-interaction (GI) process 
in [28], and usually this process is described as a spatio-temporal point process (see 
e.g. |3l]) with growing and interacting marks. The Gl-process then has been further 
studied in a series of papers (e.g. [HI [IDl El ESI EZl El]), where, among other things, 
different inference tools have been developed. 

Consider some suitable spatial study region (usually some subset of or a 
torus). The basis of the Gl-process is spatio-temporal point process which here 
will be referred to as a spatial immigration-death (SID) process. Specifically, the 
SID-process lets new individuals (points) arrive to a population (the study region) 
according to the jumps of a Poisson process on M+, assigns iid locations to them, 
which are uniformly distributed in the study region, and removes the points after iid 
exponential times (the temporal dynamics of the SID-process constitute a so-called 
immigration-death process; see e.g. j^). Moreover, once an individual has received 
its location, centred on its location we place a closed disk/ball with some given radius 
(the initial mark). As time evolves, we let the radius of each disk grow according to a 
given deterministic growth structure, which depends on the radius itself as well as the 
locations and the radii of the other (neighbouring) marked points. More precisely, 
the structure of the simultaneous growth of the radii is given by a system of ordinary 
differential equations (ODEs). Furthermore, this system of ODEs is such that when 
an individual has not yet arrived to the population or if it has been removed (which 
is governed by the SID-process), its corresponding component of the system of ODEs 
is set to zero. Considering its possible application areas, one important area is the 
dynamical modelling of a forest stand. Here, as time passes, new trees arrive, grow 
and compete with each other until they die. When we let each growth equation (ODE 
component) contain an inhibitive part, such that the growth of a point's radius/mark 
is reduced when the point is surrounded (spatially) by points with large marks, this 
inhibition reflects the natural competition for nutrients and light among trees in a 
forest. 

The description of the Gl-process as a spatio-temporal marked point process is 
rather vague and questions regarding an appropriate representation quickly emerge. 
Following [7], we will call any marked point process where the marks are function- 
valued a functional marked point process, and it has been pointed out by [7] that 
the Gl-process in fact should be represented as a functional marked point process. 
More specifically, it can be treated as a marked point process (see e.g. [TT| [T3| |30] ) 
for which the location space is that of a spatial Poisson process and the mark space 
(representing the radii of the balls/disks) is given by the space of cadlag (right 
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continuous with existing left limits) functions (see e.g. This can be realised 

by letting the unmarked part of this functional marked point process be given by 
the collection of points scattered in the study region by the underlying SID-process 
during the time interval we are studying the Gl-process. Moreover, we see that 
each mark consists of three parts: 1) the size of the mark before the individual has 
arrived (zero), 2) the size when the individual is present (governed by the ODEs), 
and 3) the size once the individual is removed (zero). Since there clearly are jumps 
between parts 1) and 2), and parts 2) and 3), we see that it makes sense to consider 
a mark space which is of cadlag-type. Note that the connection between these two 
representations of the Gl-process can be compared to the representations of a one- 
dimensional Poisson process as either a point process (random measure) on i?+ or 
Levy-process (evolutionary process). 

Naturally the growth structure of the Gl-process includes some parameters that 
need to be estimated when the model is fitted to data. For the case where the 
process is sampled at discrete times (hence creating a time series of marked point 
patterns), [29] suggested a least-squares scheme to estimate the parameters related 
to the growth and interaction of the marks (the parameters in the ODEs). This 
estimation was further considered in [TU], where a spatio-temporal edge correction 
was added to the estimation procedure. Regarding the parameters of the underlying 
(spatial) immigration-death process, which are the arrival and death rates of individ- 
uals, these have been estimated separately by means of different maximum likelihood 
(ML) estimation approaches (see e.g. [T0 | [9| [T6 | l29 ] ). In j9] the full ML-estimation of 
the discretely sampled immigration-death process was treated, and, besides treating 
some practical aspects of the estimation, consistency and asymptotic normality of 
the ML-estimators were proved. 

We note that it is unlikely that all marks in a marked point pattern (e.g. trees 
in a forest stand) have the same (deterministic) underlying growth pattern, as is the 
assumption in the original Gl-process. For instance, when we model a forest stand, in 
order to reflect phenomena such as (cumulative) measurement errors and individual 
growth features of each tree, there should be some noise or randomness present in the 
part of the model which handles the growth of the marks. Trying to rectify this lack of 
individuality in the mark growth structures, our main aim here will be to take a first 
step in the process of adding randomness to the growth of the marks. The approach 
chosen here is to add a (scaled continuous time) white noise to each component of 
the system of mark growth ODEs in the Gl-process (see "part 2)" above). This will 
generate a set of (possibly dependent) Brownian motion driven stochastic differential 
equations (SDEs) with jumps at the birth and death times (see e.g. |2H |22| [25| [3T]). 
In other words, we will be considering a functional marked point process with marks 
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given by diffusion processes with jumps. By considering the temporal evolution of 
such a process, we may also say that we have defined a multivariate stochastic jump 
(diffusion) process which is time shifted and parametrized by an SID-process. We 
note that turning the marks into diffusions has some clear advantages. To begin 
with, as a consequence of the randomness in the mark growth equations, we may 
be able to write down a likelihood structure, which is based on sampling the marks 
over time, so that we can treat the simultaneous ML-estimation of the whole GI- 
process instead of using the separate spatial and temporal estimators previously 
considered. In particular, we may exploit the Markovianity of the diffusions and 
the SID-process in the derivation of a full likelihood function. Additionally, given 
a temporally discrete sampling scheme, standard tools from likelihood theory can 
help us derive results about the asymptotic behaviour of the estimators (see e.g. 
[T^ I2D1 El E3]), when the process is sampled at discrete times. Ultimately it may 
also be possible to use tools from stochastic process theory and stochastic calculus 
to derive other theoretical results, such as asymptotic distributional properties of 
functionals of the process, which in the non-stochastic version of the Gl-process only 
have been achievable through simulations. 

Here, as an initial extension of the Gl-process to the setting where the marks are 
stochastic, we will assume that all diffusions (mark radius processes) will be inde- 
pendent and of the same type (thereby generated from the same set of parameters). 
Note that we hereby remove the interaction between the marks which was present in 
the Gl-process. Hence, we obtain a functional marked point process with indepen- 
dent diffusion process marks. We have chosen to study only the Cox-IngersoU-Ross 
(CIR) process (see e.g. |S1 [HI [20]) to describe the marks (growth of the radii of the 
disks). However, any other strictly positive diffusion which meets the requirements 
of the modelling setting in question could be chosen. Note that a nice property of 
the CIR-process (besides being strictly positive under certain restrictions) is that it 
is one of the few diffusions which possesses known closed form expressions for its 
transition densities, and since the transition densities in turn are the building blocks 
of the likelihood function we can obtain a closed form expression for the likelihood 
function. 

The paper is organised as follows. In Section |2]the stochastic Gl-process is defined 
and in Section [3] we move on to further discuss some of its distributional properties 
and its building blocks. Then, in Section HI with the finite dimensional distributions 
at hand, we define the ML-estimation regime which, in Section |5l in turn allows 
us to look at large sample properties of ML-estimators. Since the consistency and 
the asymptotic normality of the ML-estimators are proved when the mark processes 
are stationary, in Section we wish to see how robust these estimators are to the 
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stationarity assumption. In Section [7] we evaluate the estimators on the same set 
of Scots pine data considered in jlO| and in Section [S] further comments/possible 
extensions are given as well as a general discussion of the paper. Finally, in the 
Appendix proofs of the main results can be found. 

2 The SG(I) process 

The stochastic growth-interaction (SGI) process is a functional marked point 
process which can be considered to be a stochastic extension of the (non-stochastic) 
growth-interaction (GI) process (see e.g. [T m i28 | [29 | l6]). Heuristically $m is described 
in the following way. As time evolves, balls/disks (marked points) appear in a spatial 
study region W at stochastic times and the radii of these balls/disks change size 
randomly over time until they disappear after stochastic times. Due to the usual 
biological context it will be natural to refer to each point as an individual. As 
previously noted, we here consider a simplified version of the SGI-process since we 
do not include any interaction between the marked points, and we therefore will 
refer to $m as the stochastic growth (SG) process. It should be pointed out that 
due to the Gl-process' forestry application it is usual that we illustrate the process 
in such a way that its marks describe the aforementioned growth of disks in (the 
space occupied by the tree stocks), however, this need not be the case. For instance, 
when modelling the spatio-temporal development of a forest stand, one could instead 
consider the case where the marks are used to describe, say, the development of the 
height of the trees. 

As was mentioned in Section [H there are essentially two ways to construct the 
SG-process In the first representation of the SG-process, which is given in 

Section 12. ![ $ m is obtained by treating it as a functional marked point process with 
cadlag function- valued marks. The second representation (Section [52]) is obtained by 
considering the temporal evolution of the mark processes (which are parametrized by 
the other relevant information). Note that the latter representation is of significance 
since it will be exploited when we develop the statistical inference for the SG-process 
when the marks are sampled at discrete times. 

Throughout we will assume that the spatial study region is given by a subset 
W of (i-dimensional Euclidean space, d>l, with Borel sets B{W) C B{R'^) and 
Lebesgue measure uCW) < oo (note that this also includes the case of identifying the 
edges of a rectangle in order to construct a torus). Moreover, we write N = {0, 1, . . .} 
and we denote the Euclidean norm and metric by |a;| and d{x,y), respectively, for 
X, y G M'^, d > 1. Furthermore, for a given set A, 1a{(i) = l{a G A} will denote the 
related indicator function and \A\ will denote the related cardinality (it will be clear 



4 



from context whether we consider the norm or the cardinahty). 

Following [7] and the construction of a functional marked point process given 
therein, the mark space will be given by the set F = D[o,t](1^); T G M+ = (0, oo), 
of cadlag (right continuous with existing left limits) functions / : [0,T] — M (or 
F = D[o,oo)(I^) when / : [0, oo) — M). The underlying probability space will be 
denoted by 

2.1 Functional marked point process representation of the 
SG-process 

Assume now that the SG-process under consideration is given by a cadlag functional 
marked point process $m = {[Xj,Mj] : Xi G $'}, with locations Xj G and 
functional marks Mj G F. 

More specifically, we let the unmarked process $' = {Xi, . . . ,Xi\f} be given by 
a homogeneous Poisson process on W, with intensity ai'{W)T, a G 6q, C M_j.. We 
note that we hereby have N ~ Poi{ai^{W)T) locations Xi, . . . ,Xi\f G W which are 
iid ?7nz(H^)-distributed (their indices are assigned to them according to their "birth 
times" defined below). 

We now turn to the construction of the F-valued random functional marks 
Mi,...,Mn, which we will require to be almost surely (a.s.) positive. In or- 
der to generate the supports supp(Mi) = {t G [0, T] : Mi{t) ^ 0} = [5i,A), 
i = 1, . . . ,N, conditionally on $' (or simply N), let . . . , be iid Uni{0, T)- 
distributed random variables (relabelled according to ascending size) and let ad- 
ditionally Li, . . . , Ln be iid i?xp(/x)-distributed, fi G 6^ C R_,_. By now defining 
Di = {Bi + Li) AT = mm{Bi + Li,T}, i = 1, . . . , N, we have that Mi{t) = for all 
t ^ [Bi, Di) and Mi{t) > for all t G [Bi, A) a.s.. 

We note here that the "birth/arrival times" Bi < . . . < Bn form a Poisson process 
on [0,T] with intensity aulW). In order to use a terminology which illustrates how 
$A/ can be used to model the dynamics of a population (e.g. a forest stand), in 
connection to the birth times, we additionally call Li, . . . , Lpi the "lifetimes" and 
Di, . . . , D]\f the "death times" of the individuals. 

As previously mentioned, each mark Mj = {Mi(t)}tg[o,T], i = 1, • • • , A^, can be 
illustrated by the space which it occupies in at a given time t. This is done by 
means of the ball Bxi[Mi{t)] = {y e R'^ : d{Xi,y) < Mi{t)}, with centre Xi and 
radius Mi{t). Hereby, for a fixed time t G [0,T], $jv/ can be illustrated as a forest 
stand, or rather a Boolean model (see e.g. [22]), by considering the union of the disks 
or trees IJ^^ ^Xi[Mi{t)] or the union of all disks Bxi[Mi{t)], i = 1, . . . , N, such that 
t G [Bi,Di) (whereby we only observe "alive individuals"). 
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Considering now to the actual structure put on each Mj = {Mj(t)}ig[o,T] when 
t G [Bi,Di) (i.e. when the zth individual is alive), we will let Mi{Bi) = be the 
initial size of the ith mark process and to illustrate how the construction of 
originates from the Gl-process, we first recall (see e.g. |29[ [T0| |6]) that in the GI- 
process the marks Mi{t), i = 1, . . . , N, were set to develop deterministically according 
to 

Mi{t) = M°+ / dMi{s) (2.1) 

I'D, 

= M°+ / f{M,{sy,e) + J2HMr{s),M,{s),X,,X,;e)ds, 

for t G [Bi^Di). Here /(■) controls the growth of radius i in absence of spatial 
competition (so-called open growth in forestry terminology), h{-) controls the spa- 
tial interaction between individual i and the other individuals and is a vector of 
parameters which controls /(•) and h{-) (see e.g. [T0| l29]). 

Here, however, in order to initialise the more realistic growth scenario where 
the marks have random growth patterns, as previously mentioned, we assume that 
each radius grows stochastically according to an a.s. positive stochastic process (note 
that an F- valued random variable/element is a stochastic process with cadlag sample 
paths). By calling t G [0, T] our global time and t — Bi our zth local time, in order to 
properly express Mi{t) through the global time scale, we will let the processes Yi(t), 
i = 1, . . . , N, where 

M(f\-I ^^^^-Bi) fortG[i?„A) ...^ 
^^*^"10 fort ^[5,, A), 

be given by a system of independent (time-shifted) CIR-processes (see e.g. |5|[T^I2U]) 



dYi{t) = \{l-Yi{t)/K)dt + a^Y^dWi{t), (2.3) 
Y,{t) = + j\{l-^^ds + j\^Y:{s)dW,{s), (2.4) 

with V"j(0) = M?, i = 1, . . . , A^, where the VFj(t)'s are independent standard Brownian 
motions. We note that this is equivalent to setting f{x) = A(l — x/K), h{-) = and 
adding a stochastic integral to expression (12. ip . 

The parameters (A, K, a) G 6a x 0^- x G^- C in expressions (12. 3p and (12. 4p 
control different aspects of the growth of a radius Mi{t) of a ball/disk Bxi [Mj(t)]: The 
diffusion coefficient a controls the magnitude of the random individual fluctuations 
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of the radii. The interpretation of the remaining two parameters becomes most 
clear by noticing that Yi{t) is a so called mean- reverting process, i.e. as Yi{t) starts 
to move away from its long term equilibrium K, the drift term starts pulling it 
back towards K and the speed at which this occurs is given by X/K. Related to 
this interpretation we find that if we set cr = in expression (12. 3p . we retrieve 
expression (12. ip with h{-) = (the Gl-process without interaction) or equivalently 
the ODE dYi{t) = A (1 - Yi{t)/K) dt. This ODE is often referred to as the linear 
growth function (see e.g. \27\ and in this setting the parameter A is referred 
to as the (individual) growth rate while the upper bound K often is referred to 
as the carrying capacity. In conclusion, $a/ is controlled by the parameter vector 
6 = {X, K,a,a, fi) G 9 = 6a x 6;^ x 9o- x 9^ x 6^ C M^, where the pair (a,/i) 
controls the time intervals during which the mark functions are non-zero and the 
remaining parameters control the growth of the marks. 

Regarding the initial size Mi{Bi) = Kj(0) = M^, a few different options are 
available. In |10| . the approach was to use the same constant initial value M^ = 
Mq G M+ for all individuals in the Gl-process, and in [29] the M°'s were chosen as 
independent t/m(0, e)-distributed random variables, e > 0. Here, however, we also 
have the further option to sample from the stationary distribution of Yi (see 
Section [3] for details) . 

2.2 Temporal evolution representation 

In order to stress that we here consider the temporal evolution of the SG- 
process (or rather the temporal evolution of the marks), we often write $A/(t) = 
(Mi(t), . . . ,MAr(t)), t > 0. Furthermore, in order to treat it properly we let it be 
adapted to some filtered probability space {X , J^, {J^t}te[o.T),^)- Specifically, the 
family of a-algebras {J-'t}tg[o,T] is such that, for any s < t, J^s J^t ^ ^ and, for 
each t G [0,T], $m(^) is J-i-measurable. 

We here will construct $A/(t) through two building blocks. The first building 
block is given by the underlying point process $(t), which can be constructed as 
spatio-temporal point process on x [0,T], and we call it a spatial immigration- 
death (SID) process. This process governs the assignment of the spatial locations of 
the individuals in W , as well as their arrival times and their lifetimes. The second 
building block, which may be regarded as an extension of $(t), is the set of F- valued 
functional marks (stochastic processes). We start by describing the underlying SID- 
process $(t). 

Let us consider the SID-process {$(t)}t6[o,r] which is a spatial birth-death pro- 
cess (see e.g. PH E]), taking values in the collection = {x. W : \x\ < oo} of 
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finite point configurations. It has birth rate function b{-,-) = a, death rate func- 
tion d{-,-) = fi and reference probabihty measure v{B) = u{B)/i'{W), B G B{W), 
where G Oq x C M^. Hence, it can easily be verified that the under- 

lying Markov jump process is given by a so-called immigration-death (ID) process 
(M/M/cx)-queue) {A^(t)}te[o,T] (see e.g. |9l [IH [17]) with arrival rate av{W) and 
death rate /i. Furthermore, we see that the spatial location kernel is such that 
all locations Xi e $(t) G A^'^, t E [0,T], are iid [/m(iy)-distributed. Looking 
closer at the ID-process, which is a time-homogeneous irreducible positive recur- 
rent Markov chain with state space N, we see that it can be used to describe 
a population where the "birth/arrival times" Bi < ... < Bn of the individu- 
als occur according to a Poisson process on [0,T] with intensity auCW) (whereby 
N ~ Poi{ai'{W)T)) and it generates "lifetimes" Li, . . . , L^r for the individuals which 
are iid i?xp(/i)-distributed. Hereby, by defining Di = [B^ + Lj) AT, i = 1, . . . ,N, 
we finalise the equivalence with the construction of the previously defined supports 
supp(Mi) = {te [0, T] : Mi{t) 7^ 0} = [Bi, A), ^ = 1, • • • , iV, of the functional marks. 
It is sometimes important to keep track of which individuals are alive/visible and we 
therefore define the index process Q(t) = {? G {1, . . . , A^} : t G [B^, Di)}, f2(0) = 0, 
which is a Markov process which controls which individuals are alive at time t (note 
that N{t) = |$(t)| = l^^(t)l)- We note that we just as well could have defined $(t) 
as a marked Poisson process on [0,T], with jump times i?i < . . . < Bn and marks 
(Li,X,), z = l,...,iV. 

We now turn to the second building block of $a/- Similarly to the previous 
scenario, the idea here is to consider the stochastic processes Mj(t) = Mi(t; $) = 
'^[Bi,Di){t)yi{t — Bi), i = 1, . . . , N, where the Fi(t)'s are defined in expressions fl2.3p 
and (12. 4p . Just as before the parameter vector will be given by ^ = (A, K, a, a, fi) G 

e = Ga X X X e„ X c m^. 

We note that under this representation, for each t G [0,T], we may treat $a/(^) 
as a (marginal) random vector of a multivariate (i-dimensional, d < N, diffusion 
process with jumps, for which the component processes are independent, stopped 
and time-shifted CIR-processes with jumps [d is controlled by the supports [Bi, Di), 
i = 1,...,N). Note that for the conditional process $M(t)|$, the randomness is 
present only in the li(t)'s. Moreover, we note that since u^W) < 00 and T < 00, we 
have that N < 00 a.s.. It is this representation of which mainly will be exploited 
in the statistical inference parts in the remainder of this paper. 
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3 Distributional properties of the SG-process and 
its components 

3.1 Properties of the CIR-process 

Given below are some results concerning different properties of the CIR-process and 
they can all be found in e.g. |8l |20]. The explicit solution of the CIR-process, which 
is given by 

Y,{t) = K-(K-MO)e-*^/^+a re(^-*)"/^V^rfiy,(s), 

Jo 

is obtained by applying Ito's formula with f{x,t) = xe^^^^ to the SDE (12. 3p . Fur- 
thermore, when 2A > o"^ the process a.s. stays strictly positive whereas it may reach 
zero otherwise. This condition, loosely speaking, says that the drift of the SDE 
must be large enough, in comparison to the diffusion term, to ensure that the mean- 
reversion is strong enough to keep the process a.s. positive. Hence, we will require 
that 2A > so that Mi{t) > for all t G [Bi, Di). 

Since Yi{t) is a Markov process, when we require that 2A > o"^, it is possible to 
derive explicit statements about the transition distributions, i.e. the distributions of 
the random variables Yi{t)\Yi{s), s < t. For instance, when s < t the conditional 
expectation and variance are given by 

ny'i{tm{s) = ys] = K-(i^-i/.)e-(*-^)^/^, (s.i) 

Var m)ms) = Vs) = Vs^ ^^^it-s)x/K _ ^-2it-s)x/K^ 

respectively. More interesting for our purposes, however, is that under the hypothesis 
that 2A > cr^ and s < t, conditional on Yi{s) = ys, the transition density of Yi[t) is 
given by the non-central ^^-distribution density 

PYAt-s,yt\ys;X,K,a) = ae-("+^) ^'^ (2v^) ' (^.2) 

where a = 2A/ {a'^K (l - e-^*"")^/^)) , m = ayse-^^-'^^/^^ , v = ayt and q = 2\/a^-l. 
The function Iq{x) = EfcLo(^/2)^*'^V^'r(fc + g + 1), x G M, where r(-) denotes the 
gamma function, is the modified Bessel function of the first kind of order q. 
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This ergodic process also has a stationary (invariant) distribution vr = 7ix^K,<r 
which is given by the Gamma distribution with shape parameter 2A/(T^ and scale 
parameter a^K/2X. Hereby, the density of the stationary distribution is given by 

-(x; A, K, a) = ^^^f,!TS^ e-(^V.^^), ^ > q, (3.3) 

so that vr has mean K and variance a'^K'^ /2\ and, moreover, for s < t, the covariance 
function of yi(t) is given by Cov {Yi{s),Yi{t)) = ^e-(*-^). 

As previously mentioned, Yi{t) is a Markov process and given that we start 
a Markov process in its stationary distribution, it is a strictly stationary pro- 
cess. In the case of Yi{t) this means that Yi{0) = ~ vr and its finite dimen- 
sional distributions (fdds) are shift invariant w.r.t. time, i.e. {Yi{Ti), . . . ,Yi{Tn)) ='^ 
{Yi{Ti + h), . . . , Yi(Tn + h)) for any set of times Ti < . . . < T„, any h > and any 
n eN. Hereby the marginal/transition distributions do not change, i.e. for any [s, t), 
t> s>0, Y,{t) ~ vr and Yi{t)\Yi{s) ~ vr. 

3.2 Properties of the ID-process 

Recall from Section [2l2] the underlying SID-process and its temporal component, the 
ID-process, {N(t)}^~^Q. The following result, which can be found in |9], gives us the 
transition probabilities and the stationary distribution of N{t). 

Lemma 3.1. The transition probabilities of the ID-process, N{t), are given as con- 
volutions of Poisson densities and Binomial densities such that, for h,t > and 
x,yeN, 

PNit, y\x; a, n) = {Ppoi(p) * PBin{x,e->^^)) {y) 

y 

= Ppoijp) {k)PBin{x,c~i^t){y - k), 

k=0 

where Ppoi{p){-) is the Poisson density with parameter p = a(l — e~^*)/yU and 
PBin{x,e-i^t'^{') is the Binomial density with parameters x ande~^^. 
Furthermore, the stationary distribution of N{t) is given by 

TTNi-) = nPot{a/p) G ■) 

and the expected value and second moment of the pN{t,y\x; a, p)- distribution are 
given by E[N{h + t)\N{h) = i] = ie-^'*+p and E[N^{h + t)\N{h) = i] = i{i - 
1) e"^^* +(1 + 2p)i + p, respectively. 

This lemma will be further exploited in Proposition 13. 11 where the fdds of $a/(^) 
are derived. 
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3.3 Finite dimensional distributions of the SG-process 

Consider now the SID-process or alternatively the index process Q{t) and the 

population size process N{t) = |$(t)| = Recall that these processes as well 

as the CIR-process are Markov processes, which in turn implies that also is 
a Markov process. This observation will be of great importance since in Proposition 
13. II the Markov property we will be exploited in the derivation of the fdds of 

In order to set the framework, we consider the (sample) times = Tq < Ti < 
. . . < T„ < T and the distribution of ($jvf (T^i), . . . , $M(7"n)) , when we are concerned 
with exactly, say, G {1, . . . , A^} individuals who appear at Ti, . . . , T„ (recall that N 
is the total number of individuals observed if we monitor the process continuously) . 
Furthermore, provided that the joint density of ($jv/(^i); ■ ■ ■ , ^M(^n))"^ exists, when 
evaluated at the size-time matrix 



''niu ■ ■ ■ min\ 
M= : •.. : 

\mdi ■ ■ ■ rridn ) 



we will denote this density by pTi,...,Tn(^', R should be emphasised that the ith 
row of M represents the evaluation-sizes of the ith individual under consideration, 
at the respective times Ti,...,T„. We further also note that if = 0, we are 
considering the case where the ith individual is not alive at time T^. Hence, if 
rriik = for all k = 1, — 1 < n, and mu > 0, we evaluate a scenario where 
Bi e (T;„i, Ti], and when mu > and rriik = for aW k = l + l, . . . ,n we consider Di G 
(Ti, T;+i]. Consequently, if a row were to contain only zeros, we would be considering 
an individual who is not alive at any of Ti, . . . , T„, whence that individual/row may 
be removed from consideration. 

The exact form of pri,...,T„(M; ^) is given in Proposition 13.11 and the main fea- 
ture exploited in its derivation is the Markovianity of $j\/(t). We note that the 
distribution of ($j\/(Ti), . . . , <l>M(^n))^ may be expressed through $m(^)'s transition 
probabilities/densities, which are given by 

P($M(t) e A|J-,;^) = P(<l>M(t)e A|$m(s);^), (3.4) 

where A = x . . . x Ad E i3(]R'^) and < s < t < T. The proof of Proposition O 
can be found in the Appendix. 

Proposition 3.1 (Fdds of $A/(t)). Given = Tq < Ti < . . . < < T and $Af (Tq), 
if we let M° = Mq > for all i, then the joint density of (<I>m(Ti), . . . , $A'/(^n))"^; 
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evaluated M G R"'^", d > 1, is given by 

n 

PT„...,T„(M;e) = CUpm [An,\oOk\\\oOk^i\;a'^iW),fi) (3.5) 



k=l 
n 

n Pyi(ATfc,mjfc|mi(fc_i); A,i^, cr) 

k=l iSLoif;_incjfc 
d „T, 



X 



j-j- / ' {Tk^ - mi(fc^_i) |Mo; A, K, a) 



where AT^ = — T^-i, = {i '■ rriik > 0}, k = 1, . . . ,n, and ki = min{/c : i G Uk}, 
i = 1, . . . ,d. The constant C = C{u{W), M) > can be found in expression M.^j) . 
and the densities Pyi{') o,nd Pn{') o-f^ given, respectively, by expression ^J7^ and 
Lemma \3.1\ 

Conditioning on $m(To) is reasonable since we in most applications already have 
all the information about the marked points present at the first sample time point. 
Note that if we choose all Mf fixed but not necessarily equal, expression (13. 5p only 
changes in that Mf replaces Mq. Furthermore, from the proof of Proposition 13. II we 
see that the transition probabilities (13.41) are obtained by finding 

P($A/(t) G A|$A/(s) = y;^) = / p$„(t)|ci,„(,)(m|y;e)dm, 

J A 

where A = Ai x . . . x G B{W^), m = (mi, . . . , rudY G M'^, y = (yi, . . . , ydY G M'^, 

P*MWI-fMw(m|y;^) = C{v{W),m,y)pN{t- s,\u{m)\ \uj{y)\-au{W),^i^ 

X W pYiit - s,mi\yi;X,K,a) 

iea;(y)na)(m) 



1 /■* 

Y\ / PYiit - v,mi\Mo] X,K,a)dv, 



X 

i6tj(y)'=ncj(m) 



a;(m) = {i : rrii > 0}, cj(y) = {i : yi > 0} and C(z/(iy), m, y) is a constant. 

As mentioned before, when M° ~ vr we have that Yi(t) is a strictly stationary 
process and this will have a further impact on the joint densities in Proposition 13. II 

Corollary 3.1. Given the preliminaries and notation of Proposition [O], by instead 
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assuming that M° ~ TT, the joint density Ii3. 5\) becomes 



n 



PTi,...,T„ Pn ATfc, \uk\\\uJk~i\;a'^{W),fi 



) 



k=l 



n 



X 



(3.6) 



k=l i&ijJk 



Proof. We note that all transition densities py. (AT^, - I-; A, i^, a) in expression f lA.3l) 
may be replaced by the stationary Gamma densities 7r(-; A, -fC, cr) of expression (13. 3p . 



Remark 3.1. VFe may additionally require that also N{t) starts in its stationary 
distribution tij^ (see Lemma \3. 1\) so that also N{t) becomes a strictly stationary pro- 
cess. Hereby the transition probabilities Pn{') expression 6]) will be replaced by 
T^N {\^k\] fi) . Note that this change will imply that N{t) = \^{t)\ ^ Poi{a/fi) 

for all t > and under this setup, since all Yi 's are stationary, we have that 
Mj(0) ~ TT for all individuals i G f2(0). 

Remark 3.2. As we previously noted, conditionally on Q{0) = 0, the process 
E{t) = [Ji^fK^fj Bxi[Mi{t)] at each fixed time t corresponds to a Boolean model (see 
e.g. f3^). The germs {Xi}iGn(t) <^i"^ generated from a Poisson process with intensity 
measure At{B) = ^(1 — e~^*)z/(i? fl W), B G S(M^), and the grains are given by 
{Bx,[Mi{t)]}i(zQ(^t), where allMi{t)'s are iid T {2 X/a'^,a'^K/2X)- distributed. Note that 
this follows since Q{t) can be generated as a thinned Poisson process (see f^). 

4 Maximum likelihood estimation 

Conditionally on ^m{To) = $a/(0), we now assume that we sample the SG- 
process $j\/(^) as M = (0i,...,0„) at the sample times Ti,...,T„ on some re- 
gion W. Here (pk = {rriik, ■ ■ ■ ,mdk)'^ , k = l,...,n, where d = |IJfc=i'^fc| ^^d 
ujk = {indices of individuals present at time T^} = {i : rriik > 0} (we may write 
(pk = (lwfe(l)"^ifc; • • • ! '^uiui-^)^dkY' to emphaslse the individuals' life status). Now, 
based on this sampling scheme we want to find the Maximum Likelihood (ML) esti- 
mate of the parameter vector 9 = (A, K, a, a, fi) G 6. 

We note that when $jv/ is treated as in Section 12. ![ i.e. as a functional 
marked point process instead of as an evolutionary process, the estimation based 
on the current sampling is equivalent to estimating a thinned version of the pro- 
cess. More specifically, this thinned version is such that all marked points A = 



□ 
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{[Xi, Mi] : [Bi, Di) n {Ti, . . . , T^} = 0} are removed and only the partial information 
{(Xj, Mj(Ti), . . . , Mj(T„)) : i ^ A} is available to estimate the actual structure of the 
non-thinned process $Af (think of this as a sample from the previously mentioned 
Boolean model which was based on solely the "alive individuals"). 

The likelihood function of the parameters of the SG-process, = „(^;M), is 
given by the joint density of ($jv/(^i); • • • ? ^M{Tn)), evaluated at M = (0i, . . . , 0„) 
and treated as a function of ^ G 0. Therefore, depending on whether we choose 
Mj(0) to be fixed or drawn from the stationary distribution, we end up evaluating 
either expression (13.51) or expression (13.61) when we evaluate n{0). 



4.1 ML-estimation: = Mq ^ IR+ 

When we let all ^i(O) = = Mq G M+ be given by the same fixed value, from 
expression (13. 5p we obtain 

n{e) = Ci,n{0)2AGh,n{G) l,n (^)2,n (^)3,n (^) , (4.1) 

where, for ki = min{A; : i G Uk} and AT^ = Tk — Tk-i, k = 1, . . . ,n, 



l,n 



(^) = n n Pyi(ATfc,mjfc|mj(fc_i); A,/r,cr) 

k=l iewfc-iHWfc 

ATfc 



2,n 



T-r 1 r"^'^^ 

= 11 / PYAt.mik,-i)\Mo;X,K,a)dt 

n 

3,n(6') = JJ Pat ATfc, |wfc| |a;fc_i|;az/(iy),yu^ . 
fc=i 

The (rescaled) log-likelihood is given by 

ln{e) = log(C-l„(^)) =l0gi,„,(^)+l0g2,„(^)+l0g3,„(^) 
= : /l,n(^)+/2,n(^)+/3,n(^), 

whereby the ML-estimator of ^ G O, based on ($j|,/(Ti), . . . , $m(21i)), will be given 

by 

en ■■= ^„(<fA/(Ti),...,<l>A/(T„)) 

= arg max/„(6'; $a/(Ti), . . . , $A/(r„)) 

= arg max(/i.„(e) + h^O) + /3,n,(^)). 
eee 
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We now want to express the ML-estimator 9^ = {Xn, Kn,an,C(n, IJ^n) as the sum of 
two estimators 6i^n and ^2,n which, respectively, handle the separate estimation of 
(A, K, a) and (a, fj,). We note that h,n{d)+h,n{0), which only involves A, K and a, will 
be maximized by any 6i^n = i^n, Kn,d'n,<y, fJ'), (a,/^) G K^- Similarly we have that 
h,n{d)i which only involves a and /i, will be maximized by 6'2,n = (A, -ft', o", 5„, 
for any (A, a) G M.^. Hence, in order for On = Oi^n + ^2,n to hold, we must require 
that Oi^ri = (An,-ft'n,5n,0,0) and 6'2,„ = (0, 0, 0, 5„,, i.e. 

On = Oi^n + 02,n (4.2) 

= argmax {h,n{0) + l2,n{0)} + arg max l3,n{0), 

and consequently we may estimate the parameters of the ID-process and the param- 
eters related to the mark growth separately. 

When the amount of data is large or when the AT^'s are small, we may consider 
instead the approximate ML-estimation where we set l2,n{0) = so that the only 
information about the diffusions comes from the observed transitions. This is rea- 
sonable since the amount of information about the actual parameter values which 
is carried by l2,n{d) is not really substantial (in comparison to /i „(6')). Moreover, 
since there is no closed form expression available for the ML-estimator (5„,/I„) of 
the ID-process (see j^), there is also no closed form available for On in (14. 2p . Hence, 
in modelling situations one has to rely on numerical methods to find On- 



4.2 ML-estimation: - tt 

Under the assumption that we start the diffusions in their stationary distributions, 
Mf ~ TT, from expression (13. 6p we obtain the likelihood function 

n{e) = Ci^n{0)2,n{0) OC l,n(^)2,n(^) (4.3) 

and the (rescaled) log-likelihood 

ln{0) = \og{C-\{0))=\og^,n{0)+\og2A0) 
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where, for AT^ = - Tfc_i, k = l,...,n, 

(n \ n 

Y\ Y\ '^i^ik: K,a)\ = ^ ^ log vr(mifc; A, K, a) 
k=l idijJk / k=l iGuJk 

h,n{d) = log I (^ATfc, \uk\ \uk-i\;au(W),^^ 

\k=l 

n 

= ^logpN (^ATk,\u!k\ \uJk-i\]au{W),fi^ 



k=l 



Here, just as in the fixed initial value case of Section 14.1^ we deal with the separate 
estimators 

On = kn + kn (4-4) 

= arg max h,n{0) + arg max l2,n{0) 
and, similarly, there is no closed form expression available for 6n- 

5 Asymptotic inference under stationarity 

When dealing with asymptotic spatial statistics, there are different types of asymp- 
totics which may be considered. 

In the case of the SG-process, within the framework of so called increasing do- 
main asymptotics (see e.g. |36]), there are essentially two different ways to increase 
the total number of individuals observed, and consequently also the number of tran- 
sitions taking place between pairs of consecutive sample times Tfc_i and T^. The first 
approach is to increase the number of sample times of the mark processes, i.e. we 
let n grow, whereby = T also will grow. The second approach is to gradually 
increase the size of the sampling window W (with the number of sample times fixed). 
The two approaches are similar since in both cases we increase the parameter of the 
Poisson distribution of ~ Poi{aTv{W)). Here, we choose to consider only the 
first of the two alternatives. 

Consider the situation where we, without loss of generality, let W = [0, 1]^ and 
apply the equidistant sampling scheme = kA, k = l,...,n, A > 0, where 
T = Tn = nA. In what follows we denote by 6q = (Aq, -ft"o! o'O; cto, /^o) ^ © the 
true/underlying parameter vector which is responsible for generating $jvf and we 
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assume that 6 is a subset of such that 

e n {(A, K, a, a, /i) G : 2A < a"^} 



(5.1) 



Recall that this is required to keep the l^(t)'s positive. 

In the theorems and corollaries below we give the strong consistency and the 
asymptotic normality of the ML-estimator. The proofs are given in the Appendix. 
The consistency proof follows the approach suggested by Wald [35] and the asymp- 
totic normality follows the lines of the classical approach of Cramer (see e.g. |15)). 

Theorem 5.1 (Consistency). Let Q he a compact subset o/M^ such that / (5. 1\] holds. 
Then, for G O, the estimator 6^ in expression Iji4-4\ l ^■^ strongly consistent, i.e. as 
n — oo, 



9a. 



Now, by putting some additional restrictions on the parameters we may also prove 
the following theorem. 

Theorem 5.2 (Asymptotic normality). Let 6q be an interior point of Q, where is 
a compact subset o/M^ such that Ii5. 1\) holds. Require further that 6q and A > are 
such that (log(a;o + /^o) ~ log(Q^o))//^o > 2A. 

Assume that Aq is known, so that 9^ = {Kn,d'n,an, fin) the ML-estimator of 
6q = {KQ,aQ,aQ, fio). Then, as n ^ oo, we obtain 



( 



N 







4x1; 



V 



ao 2Ao 



02x1 







MO 



ao 8AoC(eo) 
02x1 



Oi 
Oi 



x2 



x2 



\ 



J 



where C{9) = ^4'' (^) — 1 > 0, ^(x) = r'(x)/r(a;), r(-) is the gamma function, 
Oixj denotes the i x j zero matrix and the 2x2 matrix I^IOq)^^, which can be found 
in expression ( 15. 5]) . is the covariance matrix related to the ID-process. 

Similarly, when is known, we estimate 6q = (Aq; -K'o; "^O; A*o) by means of the 
ML-estimator 6^ = (\n,Kn,an, fin) and, as n ^ oo, we obtain 



( 



N 







4X1; 



MO 



«o 2C(eo) 



02x1 



MO ilofil 
ao 2Ao 

02x1 



Oi 
Oi 



x2 



x2 



\ 
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The Fisher information for the discretely sampled ID-process is given by 



In{Oo)ii /at (^0)12 

-?^Af(6'o)l2 lNido)22 



(5.2) 



where 



In{&o)ii 
In{0o)22 



-1) 



a 



2 ' 



r fn ^ _ (S - l)po(/ioA - To) - /ipA 
LN[fo 12 — 2 ' 

Po 



PO/iQ 



PoPo 



Po 



g-A/xQ-j^ and zis inverse is given by 



20(1 _ 



MO 



A((i+c-"oA)po(h_i)_i) 

/ po(2ro^MoA(l^c-^o^)) + ^(5-l)(ro~MoA)^ , po(H^l)(ro-^oA) 
; 7TV. i i : 



Mo A 



V 



1 _|_ Po(S-1)(to-/joA) 
^ MoA 



{=-l)(l-c-''oA) = 
Mo A 



(5.3) 



The reason that we require knowledge of either Aq or ctq in Theorem 15.21 is related 
to the over parametrization of the r(/3i, /32)-distribution, /5i = 2A/(T^, P2 = (J^K/2X. 
We note to begin with that P2 = K/fii and for a random variable Z ~ r(/3i,/32), 
by consulting expression (1A.9P in the Appendix, we obtain the related positive semi- 
definite singular (non-invertible) Fisher information 



iz{e) 



92 log7r(Z; A, Js:,a) 



2C{e) 







i,c(^)-i 



^ 



4A 

-r2 



Remark 5.1. From the proofs of Theorem \5.1\ and Theorem \5.2\ it may be seen that 
if we reduce li^O) to 



li,n{(^) = log n{mik] A, K, a), 

k=l i&Sjf, 

where uj^ ^ ujk = (^k iff = the proofs of the consistency and the asymptotic 
normality still go through (with obvious modifications). However, the convergence 
speed will be different as well as the Fisher information I{6q). An example of such a 



18 



reduction is to choose Uk = {(-^k,i}, ^-c we choose just one element from Uk- Another 
example of a reduction under which the results still hold is to consider the subsequence 
kn = n A A{n) , A{n) = X]fc=i \'^k\, md the reduction 

i=l 

6 Evaluation of the estimators 

We now turn to the numerical evaluation of our ML-estimators and precisely we are 
interested in investigating the asymptotic robustness of the stationarity assumption. 
This is carried out by assuming that the data is generated with some fixed = Mq 
and some Oq, while we instead are employing the estimator 6'„ in expression (14. 4p of 
Section 1121 i-e. the estimator based on the assumption that ~ vr, to estimate ^o- 
We then compare the behaviours of — 6'o| and |6'„ — ^qI- 

We first note that we from expression (13.11) may conclude that q-^^o/Ko _ 
\Ko-E[Yi{t)\Yi{0) = M^]\/\Ko-M^\. Clearly, if \K^-M^\ is small then Yi{t) quickly 
approaches its steady state Kq, whence the distance \6n — On\ between the two esti- 
mators should be reduced. The same should hold if (additionally) Aq is large, since 
under this condition the mean reversion is strong, which results in small deviations 
from the long term mean Kq. Similarly, if (Tq is small then the random fluctuations 
do not influence the growth as much as the drift coefficient Ao(l — Yi{t)/Ko) and 
hereby the drift becomes the main determining factor of the speed of convergence 
to Kq. We also note that if Hq is small then the expected lifetime of an individual, 
EeQ[Lj] = l//io, tends to be longer whereby we obtain more samples of Y'j(t) when it 
is close to its steady state Kq. 

We simulate 30 trajectories of $j\/(t) on W = [0, 1]^ and sample them discretely 
according to the sampling scheme Tk = k, k = 1, . . . , 100. Then, by using the 
stationary ML-estimator 6n of Section |42| we reestimate the parameters and compare 
the behaviour of On with the (non-stationary) ML-estimator 6'„ of Section HTTl We use 
different values for the parameters and to assess when \6n — 0o \ and \6n — 0o \ are 
small. We here only consider the estimation of Aq, Kq and (Jq since the performance 
of (d„,/i„) already has been evaluated in [U]. 

As we can see in Table [H as expected, in the case of On the main determining 
factor of the bias is the size of Aq, although the size of \K — M^\ certainly plays a 
role. It should also be noted that a higher ctq seems to imply a lower bias for 6"„. 
Furthermore, we see that 9n outperforms 6n in each case given in Table [H which is to 
be expected since 6'„ is the estimator which is based on the correct model assumption. 
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Note, however, that there are parameter choices for which even the performance of 
On is a bit poor. 



= 0.1 


A 


K 


a 


= 0.1 


A 


K 


a 


True (9ri) 


0.5 


5 


0.1 


True (On) 


0.5 


5 


0.1 


Mean 6'„ 


0.5027 


5.1698 


0.1086 


Mean On 


3.2856 


3.9807 


0.9761 


Bias On 


0.5% 


3.4% 


8.6% 


Bias On 


557.1% 


-20.4% 


876.1% 


S.e. On 


0.0605 


0.5623 


0.0139 


S.e. On 


1.3928 


0.1266 


0.2480 


M^ = 5 


A 


K 


a 


Ml> = 5 


A 


K 


a 


True (Oo) 


0.5 


5 


0.1 


True (6*0) 


0.5 


5 


0.1 


Mean On 


0.4241 


5.0385 


0.1006 


Mean On 


2.9054 


4.9987 


0.2185 


Bias On 


-15.2% 


0.8% 


0.6% 


Bias On 


481.1% 


-0.03% 


118.5% 


S.e. On 


0.1981 


0.4780 


0.0063 


S.e. On 


1.2660 


0.0539 


0.0540 


Mf = 0.1 


A 


K 


a 


M° = 0.1 


A 


K 


a 


True (Oo) 


3 


5 


0.1 


True (^0) 


3 


5 


0.1 


Mean On 


2.9950 


4.9926 


0.1036 


Mean On 


3.1261 


4.8713 


0.2425 


Bias On 


-0.2% 


-0.1% 


3.6% 


Bias On 


4.2% 


-2.6% 


142.5% 


S.e. On 


0.2196 


0.0708 


0.0121 


S.e. On 


1.2823 


0.0320 


0.0569 


M° = 0.1 


A 


K 


a 


M° = 0.1 


A 


K 


a 


True (6'o) 


3 


5 


0.5 


True (Oq) 


3 


5 


0.5 


Mean On 


2.9866 


5.0513 


0.4974 


Mean On 


2.7822 


4.9437 


0.5126 


Bias On 


-0.4% 


1.0% 


-0.5% 


Bias On 


-7.3% 


-1.1% 


2.5% 


S.e. 6'„ 


0.2883 


0.1505 


0.0226 


S.e. On 


1.2616 


0.1524 


0.1288 



Table 1: Parameter (re)estimation of Aq, Kq and ctq using the non-stationary ML- 
estimator On and the stationary ML-estimator On- The estimates are based on 30 
reahsations of {$M(^)}te [0,100] (non-stationary) sampled ai Tk = k, k = 1,...,100 
(with time discretisation step dt = 0.01) which are generated from ao = 0.5, no = 
0.01, W = [0, 1]^ and the above parameters {M^ and Oq). 



7 Modelling Scots pines 

As previously mentioned, the SG-process is constructed as a stochastic extension 
of the Gl-process, under the assumption that the interaction between the marks is 
negligible. Hence, when considering the Gl-process' main application area, which is 
the dynamical modelling of forest stands, it makes sense to employ the stationary 
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mark SG-process when we want to model a homogenous forest stand (trees of the 
same species with similar ages) where e.g. the distances between the trees are large 
(we may ignore the interaction). 

One data set which (arguably) may be considered to fulfil these requirements is 
the set of Swedish Scots pines considered in [10], which is illustrated in Figure [1] (all 
tree radii have been scaled by a factor of 10 for increased visibility). The spatial 
region W under consideration here is given by a circular region of radius 10 meters 
and the actual data set is given by a time series of marked point patterns, recorded 
at the years 1985, 1990 and 1996, where the approximate age of the forest stand in 
1985 was 22 years. Hereby we may set Ti = 22, T2 = 27 and T3 = 33, and we have 
iV(Ti) = 13, N{T2) = 26 and N{Ts) = 43. To be precise, for each T^, k = 1,2,3, 
each marked point pattern consists of measurements of radii (at breast height) m^fc 
and locations (stock centres) Xj G TV of the trees i G {j : nijk > 0} which are present 
at Tfc, and only trees having reached a radius of 0.005 meter are included in the data 
set. 



Pines 1985 Pines 1990 Pines 1996 




-10 -5 5 10 -10 -5 5 10 -10 -5 5 10 



Figure 1: Swedish Scots pines: plots recorded in 1985 (left), 1990 (middle) and 1996 
(right). The radii of the pines are scaled by a factor of 10. 

The approach used in [10] to model this data set was to employ the so-called 
logistic growth function f(Yi{t); 9) = XYi(t) (1 — Yi(t)/K) as individual/open growth 
function in (12. ip and the so-called area-interaction function h{-) (see expression (12. ip ) 
to describe the spatial interaction between the marked points. We note that both 
this individual growth function and the (linear growth function) drift coefficient 
f{Yi{t)]6) = A (1 — Yi{t)/K) in the CIR process are special cases of the so-called 
Von Bertalanffy-Chapman- Richards (VBCR) growth function (see e.g. [2^), whence 
their behaviours are quite similar. 

As previously mentioned, besides a and fi, the parameters under consideration 
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here are the growth rate A, the carrying capacity K and the diffusion parameter a. 
In Table |2] we find, together with the resuhs obtained in [TU], the results obtained 
after having fit the SG-process to the data set in Figure [T] Note that the choice 
= 0.005 has been made (in the non- stationary SG-process and in the Gl-process) 
since the trees in the data set have been measured only once they have grown to at 
least a radius (at breast height) of 0.05 meter. Regarding the estimation of (a,^), 
|10| obtained a = 0.0042 and /t = (based on the estimators given therein). Here, 
we obtain (a,/i) = (0.0613,0.7020) whence, once the forest stand has become old, 
we would expect av{W)/(i ~ 27 trees in W. 





A 


K 


a 


GI 


0.078 


0.095 




SG 


0.371 


0.073 


0.151 


Stationary SG 


1.269 


0.062 


0.218 



Table 2: Results obtained after fitting the stationary mark SG-process and the GI- 
process (results from [10]) to a data set of Swedish Scots pines. 

It comes as no surprise that K is larger in the Gl-process than in the SG-process. 
This follows since in the Gl-process, the estimation of the open growth (A and K 
in the logistic growth function) takes into account also that the observed sizes 
are results of an open growth which has been inhibited by spatial interaction, i.e. 
/(■) is inhibited by h{-). Since maxjfcmjfc = 0.0860 (see [10]) it is probable that 
the SG-process underestimates K a bit. Moreover, by comparing the results for 
the stationary and the non-stationary SG-process, we conclude that an increased a 
(larger fluctuations) for the stationary case also results in a stronger estimated mean 
reversion (increased A). 

From the differences in A and a for the two SG-processes we have indications 
that the data set has not (yet) reached stationarity, which is to be expected since 
the forest stand we are considering is quite young. 

In conclusion, mainly due to the difference in K between the SG-process and the 
Gl-process as well as the sensibility of having stochastic marks in the Gl-process, 
this pilot study certainly motivates a further investigation of the applicability of 
the full SGI-process, where we include an interaction function h{-) in the drift term 
of each diffusion Mi{t), t G [Bi,Di), i.e. where we add a stochastic integral term 
/J^' a{Mi{t))dWi{t) to expression fl^ . 
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8 Discussion 



We have here considered the Gl-process only in the context of the CIR-mark process, 
but we may just as well employ any other positive diffusion for the growth of the 
marks. As previously noted, the linear growth function, which is the drift function in 
the CIR-process, is a special case of the Von Bertalanffy-Chapman-Richards (VBCR) 
growth function (see e.g. Another special case of the VBCR growth function is 

the aforementioned logistic growth function f(Yi{t); 9) = XYi(t) (1 — Yi(t)/K) which 
has been used in the Gl-process in e.g. |10[ |28| [29] . 

A further modification which may be made is to change the diffusion term 
a{Yi(t);6) = ay^Yjf) into any other diffusion term which keeps Yi{t) positive, e.g. 
a{Yi{t);6) = aYi{t)'^, 7 > 0, which is the diffusion coefficient found in the CKLS- 
model (see e.g. |5]). Note that when applying these changes, we would typically not 
have known closed form expressions for the transition densities, py-(t, yi|yo; 6'). The 
transition densities are know only for a few special cases, including the CIR-process. 
Therefore, we have to use different approximated/pseudo likelihood methods for the 
estimation of the parameters (see [20] for a good general overview). 

Our final goal is to ML-estimate all parameters of the full SGI-process, i.e. to 
include also the spatial interaction function h{-) in expression (12. ip . Here the lack 
of closed form expressions for the transition densities remains and, just as for the 
previous adjustments suggested, the estimation requires that we employ approxi- 
mated/pseudo likelihood methods. For instance, jT] suggests an approach where the 
transition densities of multivariate diffusions may be approximated by series expan- 
sions based on hermite polynomials. Note further that within this setting, in order 
to reduce edge effects (absence of individuals outside the boundary of W) , it would 
be sensible to choose to be a torus. 

Also, thus far we have introduced the type of death which occurs when t > Di, 
i.e. the life-time of the individual has expired. Following the terminology of ^9], we 
can refer to this type of death as natural death. It is possible, however, to introduce 
another type of death, namely so called competitive death (or interactive death), 
and its introduction entails a slightly different formulation of the diffusions Mi{t), 
i = 1,. . . ,N. By defining the death-time of individual i to be (the stopping-time) 
(i = mi{t > Bi : Mi{t) = 0} A -Dj, we have that if Mj reaches the absorbing state 
Mi{t) = for some t G {Bi, Di) it stays and we say that it has suffered a competitive 
death. Furthermore, if it does not die from competition in [Bi, Di) it will still die at 
time Di, i.e. at its natural death time. As soon ast > Q the interaction between Mj(t) 
and the other marks will terminate, hence we remove individual i from consideration. 
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A Appendix: Proofs 
A.l Proof of Proposition 13.11 

The proof of Proposition 13.11 exploits the Markov property of ^nj{t). 

Proof of Proposition [X71 We first note that, by construction, when evaluated at M, 
we may express the the joint density of {^m{Ti), . . . ,^M{Tn))^ through its two 
building blocks - the CIR-process and the underlying process $(t). More specifically, 
we have that 

PTi,...,t„(M;^) = pa/(Ti,...,t„)|*(M; A, J'S:,a)p$(a;, {xi}ti;a,/i), 

where p$(u;, {xjjf^^; «, /x) is the density of (<I>(Ti), . . . , <I>(T„))^, evaluated at the 
index sets {Q{Ti) , . . . , Q{Tn))'^ = uj = (wi, . . . , a;„)'^, Uk = {i : rriik > 0}, 
and the locations {Xi)f^-j^ = (xj)f^^. We note that the total number of indi- 
viduals under consideration hereby is given hy d = |IJfe=i'^fcl- Furthermore, the 
density pM{Ti,...,Tn)\<s>{^] K, a) is the conditional density of the diffusions, given 
{Q{Ti) , . . . , Q{Tn))^ = u:, and we note that, probabilistically, this is a statement 
only about the diffusions Yi{t), . . . , Yd{t). 

We start by considering the part concerning the underlying process' behaviour at 
Ti, . . . , T„. The density ^$(0;, a, /i) can be further rewritten as the product 

p$(a;, {xi}^^^; a, /i) = p$|x(w; «, /i) x Px{{xi}i=i; z^(W^)), 

where p,^\x{^] f^) is the conditional density of (f2(Ti), . . . , Q{Tn))'^, given {Xi)f^^ = 
{xi)f^i, and Px{{xi}f=i; t^(W)) is the density of the locations {Xi)f^i. We thus con- 
clude that p$|x(u;; a, fi) is a statement only about which intervals [Bi, Di) that cover 
Ti,...,T„. Now, by letting di = \uji\, . . . ,dn = |wn| and recalling the ID-process 
N(t) = \ fl(t)\, when additionally conditioning p$|x(t<J; «, /i) on {N(Tk))^^-^ = ((ifc)^^j^, 
we obtain 

p$(a;,{xi},li;a,/i) = p#|x(^^; «, /u)px({a;i}li; i^(W')) (A.l) 

= P^\x,N{{uJk)k=i) PN{{dk)k=i; a, /i) 
xpx{{x,}U;u{W)), 



24 



where pisf{{dk)^=i', c^, fJ^) is the density of {N(Tk))k=i, evaluated at {dk)k=i, and 
P<!>\x,N{{i^k)k=i) is a statement about the order of appearance of the individuals in 
the index sets. 

Starting with the last of the components of expression (lA.ip . we clearly see 
that Px{{xi}f^i;i^{W)) = h'{W)~'^, since the Xj's are independent and uniformly 
distributed over W. Moreover, from the Markov property of N{t) we have that 
PNi{dk)k=i', may be written as a product of its transition densities, i.e. 



Piv((4)fc=i;«)/^) = (ATk,dk dk-i;au{W),n 



k=l 



where AT^ = — Tk_i and p^it, y\x] auCW), fi) is given by Lemma [3.11 Regarding 
the first of the parts constituting expression flA.ip . it is given by p<s>\x,N{{^k)k=i) — 
l/\V\, where 



ni 



V = luj : ui,...,uJn^{l,...,d} = (Juk]\u}i\=di,...,\un\=d 

I k=l 

i G OJk^i, i ^ UJk ^ i ^ t^fc+i for any k = 2, . . . ,n — 1 

This can be seen by considering the matrix A = [ai^k]i=i,...,d; k=i,...,n, which has entries 
ai,k = HTk e [Si, A)}. When we condition on (iV(rfc))Li = (4)Li we only 
specify that the column-sums of A are given by di, . . . , dn, whence we still have to 
determine what the probability is of A being observed as the matrix A with entries 
cii^k = ^{f^ik > 0}. It may be seen that the sample space of the conditional random 
matrix A\{[N{Tk))^^i = (4)^=1) is given by the d x n-matrices which have 0-1 
entries, column-sums di, . . . ,dn and all I's in each row connected (this follows since 
individuals cannot start living again once they have died); when n = 5, say, a row 
can be given by e.g. (0,1,1,0,0) or (1,1,1,1,0). To obtain a bound for \V\ we 
find that the number of matrices which have rows with connected I's is given by 
dY.]=iin - U - 1)) = dn{n + l)/2, whereby 2/dn{n + 1) < 1/\V\ < 1. 
Hence, we may summarise expression (lA.ip as 



p^{u},{xi}i^^;a, n) = CY\_Pn (^^Tk,dk dk-i;au{W), 

k=l 

where the constant 

C = uiWy'^lVl-^ (A.2) 
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depends on z/(iy) and M (in particular d, di, . . . , dn)- 

We now turn to the part of the density which is related to the mark processes. 
Due to the independence of the marks, the Markovianity of the CIR-process and the 
uniformly distributed birth times, we have that 

n 

PAf(Ti,...,T„)|$(M; A,K,cr) = JJ Yl PYA^Tk,mik\mi^k-iy,>^,K,a) 

k=i ieujk-ir^'^k 

TT PYAn,-t,m.,,^-.,\M,;X,K,a)^^^ ^^^^^ 



X 



where pYiit — s,yt\ys', X, K, a) is given by expression (13.21) and ki = mm{k : i G Uk}- 
The first part of the above expression includes all considered transition time pairs 
(Tfc_i, Tfc) whereas the second part includes the transitions between the (unobserved) 
arrival time Bi G (Tfc.„i,Tfe-) and the first sample time T^. at which the individual is 
observed. 

Although there is no information available regarding the exact death times and 
death sizes, it could be argued that the (unobserved) death sizes have been ignored 
in expression (lA.3p . Letting ki = max{A; : i G Uk} be the index of the last sample 
time point at which the individual was alive and T^.+i = oo if k^ = n, we obtain 

py,{t - n^,x\m^- \K,a)_^^^^ ^ r%^^ I ^ ^ 



as (possible) contribution to expression (lA.Sp . Hence, the death sizes may be ne- 
glected. 

□ 



A. 2 Proofs of Theorem 15.11 and Theorem 15.21 

The consistency is shown by using the classical Wald approach and the asymptotic 
normality proof follows the approach of Cramer. 

Before turning to the proofs, we first note some (well known) results used in 
the proofs of the consistency and the asymptotic normality of the sequence of ML- 
estimators On- 

The following lemma, which can be found in [15], will be used in both the con- 
sistency proof and the proof of the asymptotic normality. 

Lemma A.l (Uniform Strong Law of Large Numbers). Given that Zi, Z2, . . . are 

iid copies of the random variable Z , assume that: 
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(i) 6 is compact, 

(a) U (x, 9) is upper semi- continuous in 9 for all x and there exists a function K{x) 
such that E[i^'(X)] < oo and U{x, 9) < K{x) for all x and 9, 

(Hi) for all 9 and for all sufficiently small p > 0, sup^.^^g/ g-^^p U (x, 9') is measurable 
in X. 

Then 

P (limsupsup- < sup/i(^) I = 1, 
Y n^oo 6*60 n 6»ee j 

where fi{9) = E,[U{Z,9)]. If we replace (ii) and (Hi) by 

(a)' U{x, 9) is continuous in 9 for all x and there exists a function K{x) such that 
E[K{X)] < oo and \U{x,9)\ < K{x) for all x and 9, 

we obtain instead 




A further convergence lemma, Slutsky's lemma, which can be found in e.g. |15| . 
is used both in the consistency proof and in the proof of the asymptotic normality. 
It combines converging stochastic sequences. 

The Lindeberg- Feller central limit theorem (see e.g. [33]), which we will exploit in 
the proof of Theorem 15. 2[ gives us the asymptotic normality of sums of independent 
random vectors which are not necessarily identically distributed. 

We here (partly) will consider a stronger version of the Lindeberg-Feller central 
limit theorem, which is given by a multivariate form of the Lyapunov central limit 
theorem, which can be found in e.g. [19] (it is stronger in the sense that if the 
Lyapunov condition holds then the Lindeberg condition is satisfied (see e.g. |18))). 

Proof of Theorem \5.1\ From [^ we already have that the ML-estimator of the ID- 
process is strongly consistent, i.e. Pn) (ttO;/^o); as — )■ oo. Hence, if we 
manage to show that (A„, Kn, (3"„) —4 (Aq, Kq, ctq), as n — )■ oo, then Slutsky's lemma 
(see e.g. |l5]) gives us that 9n 9o. 

To simplify the notation we write 6 for Qx x Qk x ©o- so that 9 = (A, K, a) G G, 
and it now remains to show that 9n = (A„, Kn, <5"n) (Ao, Kq, ctq) = ^o, as n — oo. 
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The idea of the proof is to show that, for any 5 > 0, if we assume that 6'„ G {0 G : 
d{(^i do) > (5} we get a contradiction. 

By suppressing all conditioning in the iid random variables M((Tfc)|Mj(Tfc_i) and 
relabeling the A{n) = X]fc=i observations of the Mj(T/c)'s up to time T„ as 

Zi, . . . , ZA{n)i we may write the log likelihood as 

A{n) 

ll,rm = Y.^:{0), (A.4) 



i=l 



k{9) = log7r(Z,;e)=log 



2-2V'^'-lg-^z(2A/a2A') 



= (2A/(t2 - 1) logZ, - Z,{2\/a'^K) - \ogV{2\/a'^) 
-{2\/a^) (21og{a} + log{A'} - log{2} - log{A}) . 

We note further that A{n) is non-decreasing in n. Since the total number of individu- 
als is given by ~ Poi{anA), we have that —4 oo, as n — )■ oo, and by the strong 
law of large numbers we have that j;^ Yl!i=i > ^} E[l{Li > A}] = e"^'^ > 
0, as — )■ oo, for any (a, /x) G x B^. Hence, as n — )■ oo, we will observe the sta- 
tionary diffusions Yi{t) an infinite number of times at our sampling times Ti, . . . , T„, 
whence A{n) —4 oo. Note that if we for instance choose to include only one obser- 
vation of each diffusion in flA.4p (say the last one), i.e. A{n) = ||J^^-|^ r2(A;A)|, the 
convergence would still hold. 

Treated as a function of 9, we note that the MLE 9n also maximises -^^h,n{9) — 

.A(n) 



A(n)h,ni(^o) = Zli=i (^i(^) " hi9o)). Consider now the function 



U{x,9) := \og7r{x]9) -\og7i{x]9o) (A.5) 



a, 



2 ( -2 - i log(x) -2 -— ) x + log 



2Ao 



+ (2Ao/ao') (21og{ao} + log{Ko} - log{2} - log{Ao}) 
- {2X1 a^) (21og{a} + log{A'} - log{2} - log{A}) , 

which clearly is continuous in both arguments, and thereby a measurable function 
of X. 

Identifiability: The only way in which expression (lA.SP can be set equal to is to 



require that 9 = 9q. Hence, {vr(-; 9) : 9 E 0} is an identifiable family of distributions, 
and this in turn guarantees that the ML-estimator converges to the unique maximum 
9q (the uniqueness follows from e.g. Lemma 5.35, p. 62, in |33]). 
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Conditions of Lemma lA. 11 First we note that the continuity of U {x, 9) imphes 
that it is upper semi-continuous in 9 for all x. Now, in order to find the required 
bound K{x), we note that 

U{x,9) < 2sup|log7r(x;e)| 
6»ee 

< 2 (^^ + 1^ I log(x) I + ^x + 2 log \2\/g_^] ! 
4A / 

+ ^ sup I log{A}| + sup I log{ir}| + 2 sup I log{a}| + log{2} 



=■■ K{x), 

where e.g. A = sup{6a}, £ = inf{9o-} and \x] = inf{?/ G N : x < y} is the 
ceiling function. Because of the boundedness (compactness) of 6 we have that 
Eq;, < oo since also Egp[Zi] = Kq and (by Jensen's inequality and Appendix 



EeJ|log(Z,)|] = E,„[v/MZ:p]< v/E,Jlog(Z,)2] 

2Ao V ^, 2Ao \ , /2Ao 



/ 2Ao\ , , / / 2Ao 



where ipi^) = T'{x)/T{x). 

By the measurability of U (x, 9), we now may define the continuous function n{9), 
which by [23] is given by 

fi{9) = Ee,[U{Z,,9)] = -D{7reo\\7re) 

2Ao\ /2Ao 2A\ 2Ao , T{2X/a^) 



^oV V^o ^0 r(2Ao/a, 



2A, 

+ — log 



f2Xo 


2y 


\ 2Ao 












2XKo 









where Dijie^ | Ittq) is the KuUback-Leibler divergence between the the two distributions 
involved. 

From the Shannon- Kolmogorov inequality we have that ^{9) = — D(7re„ | Ivrgi) < 0, 
with fi{9) = —D{-^^o^-,\\'^^e) = iff ^ = ^o- Hence, by letting 6 > and defining the 
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compact set C = {6' G : d{6,6o) > 6}, we note that by its continuity, fi{6) attains 
its (negative) maximum on C, i.e. /i = supgg^ fi{9) < 0. 

Since suprema of measurable functions are measurable we get that, for any SCO, 
the function supgg^ U (x; 6) is measurable in x. 

Argument: Now, by Lemma [A. II we have that 



-5^f/(Z,,^)<^j =1, 



P I lim sup sup 

n-s>oo 6»eC A[ 

SO that there a.s. exists an n* G N such that for all n > n*, 

say. But at the same time, since Yl,^=i U{Zj, 9q) = 0, we must have that 

A{n) ^ Ain) 

Hence, for n > n*, this implies that On ^ C a.s., or equivalently that diOniOo) < 6. 
Since 6 > was arbitrarily chosen we get that ^„ —4 6*0 as n — oo. 

Note that we always can find a measurable selection 6'(x) such that /„(x; 6{x)) = 
supggQ Zn,(x; 6*), for all x (see e.g. [U])) whence the measurability of 9n never is ad- 
dressed. 

□ 

Proof of Theorem \5.S\ Given 9^ = (Xn, Kn,an,an, fi'n)'^ and 6q = 
{Xq, KQ,ao,ao, Ho)'^ , we are here interested in the asymptotic distribution of 
\/n{6n — Oq)- Since either Aq or ctq is known, we will be dealing with the asymptotic 
distributions of 

(i) y/n{9n -Oq) = ^/n {kn - Kq, an - (Jo, an - "0, An - yUo) , 



^ii) y/n{9n -Oq) = ^/n (^Xn - Ao, Kn - Ko, an - ao, fin - /io] 



T 



Unless necessary, we will not distinguish in the notation between the two scenarios 
above, and in what follows we will prove that, when n oo, the random vector 
\/n{9n — Oq) will asymptotically have a Gaussian distribution. 
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From the consistency proof we recall Zi = Mj(Tfc)|Mj(Tfc_i), A^^ = |f2(A;A)|, 
^{n) = X]fc=i and the (parameter reduced) log-likelihood function 

Ue) = /„(^;<I>m(A),...,$m(^A)) 

= /i,„(^;(Z,)fJ;'^)+/2,„(^;(iV.)r=i) 

n n 
k=l k=l 

where Zk = (^i)ien(fcA) and 

ien{kA) 

j Iy{K, (t; Zi) = log7r(Zj; Aq, K, a) if Aq is known 
[ /y (A, Zj) = log 7r(Zj; A, /T, ctq) if ctq is known, 

log Pat (A,iVfc|iVfc_i;a,/i) . 

The indexation of the Zj's will have two meanings, which will be clear from the 
notation: We either deal with (^i)ien(fcA); k = 1, . . . ,n, or {Zi, . . . , ZA(n))- Note that 
although we simply write /y(6'; Z^), there is still a dependence of {f2(/cA)}^^^ present 
(recall the construction of {6)). 

We will here denote hy Sp = {6 & <d : d{6, 9q) < p} the closed neighbourhood of 
6q with radius p > and we note that the consistency holds also for this (restricted) 
compact parameter space. Now, given the conditions under which ^„ was proved 
strongly consistent, we get that 6n is a strongly consistent sequence of roots of the 
likelihood equation in{d) = ^^^^^ = O4X1, where Ojxj denotes the i x j zero matrix, 
i.e. /„(^„) = O4X1 (see e.g. thm^ 18, p. 121, [15]). 

As we shall see, the vector ln{d) and the 4x4 matrix ln{G) = d"^ ln{0) / 06"^ are well 
behaved enough to Taylor expand ln{G) around 

Ue) 

or 

V^{0 - Oo) 
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Iy{0;Z,) = 



in{Oo)+ / Udo + x{e-9o))dxie-9o), 

Jo 



-1 



n 



n 



1 



-B^{ey'—{L%)-UG)). 



We wish to prove that, when evaluated at 6 = 6n, the right hand side of the last row 
of the above expression converges in law to a zero mean multivariate normal distri- 
bution. By managing to show that —Bn{9n) — I{9o), then eventually —B~^{6n) 
will exist, given that the inverse of the (asymptotic) Fisher information I{9q) exists. 

Then, once we have shown that /'n(6'o)/v^ ~^ G ~ N {Q,I{9q)) as n ^ oo, by 
means of Slutsky's lemma (see e.g. |15]) we may establish that 



In 



(A.6) 



J(^o)-'G~iV(0,J(^o)-'). 



The 4x1 vector /„(6') of first order partial derivatives found in expression (1A.6P 
is given by 



k=l 



k=l 



k=i ien{kA) 



Iy{9;z) 
iNiO;y,x) 



k=l 



dlY{0;z) dlY{9;z) ^ 

dK ' da 'Ulx2( li AO 



if An is known 



^^,0ix2) if ao is known. 



ax 



Oi 



x2, 



dlN{0\y,x) dlNiO;y,x) 



da 



where the elements of /y (^; z) can be found in expression (1A.9P and those of In{0', U, x) 
are given in [9]. In the integral expression, Bn{0), of flA.6p we also find the symmetric 
matrix of second order partial derivatives 



ln{0) = /l,nW+^2,n(^) = 5^/Y(^;Zfc) + ^/^(^;iVfc,iVfc_l) 

n 



k=l 



where 



k=l ien(fcA) 



k=l 



k=l 



^02x2 


02x1 


02X1 \ 


Olx2 






do? 


dadfi 


\0ix2 






dfida 
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and 



lyid', z) = l{Ao known} 



( dHY(d\z) dHY(B\z) 

dK^ dfjdK 
d^lY{e;z) d^lY{e;z) 



\ 



dcjdK 
O2XI 



O2XI 



0lx2^ 

0lx2 
02x2 J 



(A.7) 



I ^ Ulx2 > 



+l{cro known} 



v 



aA2 

d'^lY{d\z) 
dKdX 
O2XI 



dKdX 
dHY{e;z) 

dK^ Uix2 

O2XI 02x2/ 



and the elements of Iy{9\z) and li\f{6;y,x) are given, respectively, by expression 
(lA.Op and [H]- By writing the parameter vector as ^ = (^1,^25^35^4) and consulting 
expression (lA.Op . then for all j,k = 1,2, we can find bounds such that, for constants 
Cj^i,Cj^2,Cj,3 < 00 and Cjk^i,Cjk,2,Cjk,3 < 00 which depend on 6 (or alternatively 



on 5", 



dwie-z) 


< 




dlY{e-z) 




sup 








< 






< 




dHy{e-z) 




sup 








< 




z) = Cjk,i 



{Z) + Ujk,2Z + Ojfc,3, 

Ee[i^'j(Zi)] < 00 and E,g[Kjk{Zi)] < 00 (recall the finiteness of E6)[| log(Zj)|] from the 
consistency proof). 

Note further that differentiation under the integral sign always is permitted (i.e. 
we may interchange differential operators and integrals/expected values) since the 
Gamma-distribution belongs to the exponential family. Since all partial derivatives 
of lyiO] z) above are continuous functions, it follows that both ly {6; Zi) and ly {9; Zj) 
are measurable (i.e. random variables). 

Convergence of Ini^o) / V^'- We now return to expression (1A.6I) and consider the 
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weak convergence 



1 • 



n / 

k=l \ ^ ien(feA) 



)+^/jv(^o;iVfc,iVfc-i) 
n 



— Z¥(6';Zfc) 

G~iV(0,/(^o)) =iV( 04x1, 



lyido) 02x2 
02x2 In{(^0, 



as n ^ oo, which we will prove by means of Lindeberg-Feller CLT (see e.g. |33] ) 
as opposed to the usual central limit theorem (since the component of the sum in 
expression flA.SP are not identically distributed). 

We start by showing that the (asymptotic) mean of ln{0o)/y/n is zero. Since 
we may interchange derivatives and expectations in the case of 7i{z]6), and since 
J T[{z\ 9Q)dz = 1, we have that 



E, 



Bo 



dn{z;9o)/d9o 



7r{z] 9Q)dz = 



7r(z; 9o)dz 



whereby [-^Zi „(^o)] = 0. That also E5ig[^/2,n(6'o)] = follows since from [9] we 
have that Ee„ [/';v(^o; N„ N,_i)] = 0, and thus EeJ^/n(^o)] = 0. 

Turning now to the covariance matrix I{9q) of expression (lA.Sp . from the Tower 
property of conditional expectations we further obtain that 



Gov ( ^/i,„(6'o), -^i2,nido] 



n n I \ 

/y(eo;^i),/7v(^o;iVfc,iVfc-i) 



'6*0 



0, 
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which in turn imphes that 



1 • 



\ , 2,ny^0, 

n 



However, from [9] in combination with [12]) we already have that the non-zero compo- 
nents of Var^Q [l2,n{0o) / y/n) converge to the Fisher information In{(^o) of expression 
(15. 2p . Hence, in order to find the Fisher information I{6o) of expression ( lA.Sp . it now 
only remains to show that the non-zero elements of Var^g {h,n{do) / y/n) converge to 
/y(^o), as n ^ cx). By exploiting that Ee,[N{kA)\N{0) = 0]'= ^(l - e'^^^'^ ) (see 
Lemma ISTTl) and by noticing that lim^^oo ^ X]fc=i ~ e"'^"'^^) = 1, we obtain 



Vareo —=li^n{Oo, 
'n 



k=l 



1 " 

k=l 
1 " 



k=l 



=/z(eo) 



1 " 

k=l 
1 " 



Through expressions (IA.7P and flA.9p . it can now be checked that the above 
covariance matrix is given by 



Iy{Oo) = { 



/y(6'o) 02x2 
02x2 02x2 

! 2Ao 
g 8AoC(eo) 
2C(eo) Q 



if Ao is known 
if (To is known. 



35 



where C{9o) = ^ip' (^^j - 1 > and the positive definiteness of /y(6'o) follows 

since clearly y^/y(^o)y = l/i^y(^o)ii + ylW{0o)22 > 0, for any y e y ^ 0. 

In order to finalise the convergence of expression IKM it now only remains to 
show that the convergence to a Gaussian law holds. Explicitly, by means of the 
Lindeberg-Feller theorem (see e.g. [33]), we we want to show that, as n — ?■ oo, the 
sum -j^in{Oo) = Y2=iiXk + Rk), where 

Xk + Rk = —^h {0] Zfc) H i=iN{Oo; Nk, Nk-i) 



^ V iyi9o;Z,) + ^iMi9o;Nk,Nk-i 



ief2(fcA) ^ 

converges in law to the Gaussian distribution in expression (lA.Sj) . In order to do so 
we need to show that the Lindeberg condition of the Lindeberg-Feller CLT is satisfied 
for the sequence Xk^ Rk-, i-e. for every e > 0, 

n 

5(n) = ^E,„ [|Xfe + i?fe|'l{|Xfe + i?fe| >£}] ^0, 
fc=i 

as n — >■ oo. 

We note that by the triangle inequality, 

|-^fc + -RfcP < (|-^fc| + l-Rfcl)^ = l-^fcp + l-RfcP + 2|Xfc||i?fc|, 
\{\Xk^Rk\>e] < l{\Xk\ + \Rk\>e} 

< l{\Xk\> e/2} + l{\Rk\> e/2}, 

and since Cov(Xfc, Rk) = and E^j, > e}] = P (|-Ra;| > ^), we obtain that 

n n 

S{n) < Y,^8o [\Xk\^H\Xk\ > s/2}] + 5^E,„ [\Rk\H{\Rk\ > s/2}] 

k=l k=l 
n n 

+ ^Ee„ [\Xk\']F{\Rk\>e/2) + J2^eo [\Rk\^]F {\Xk\ > e/2) 

k=l k=l 

n 

+2^E,, [\Rkme, [\Xk\l{\Xk\>e/2}] 

k=l 



+2^E9„ [\Xk\]¥.e, [\Rk\l{\Rk\ > e/2}] 



k=l 



Si{n) + S^in) + Ss{n) + S,{n) + S,{n) + Se{n). 
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We now want to show that each of the six sums above converges to zero as n 
tends to zero. We see that checking that the sum S2{n) tends to zero, as n — )■ oo, is 
to check that the Lindeberg condition is satisfied for the equivalent convergence in 
the discretely sampled ID-process, and this already holds (see the combination of j9] 
and [12]). 

Considering the first of the sums, Si{n), instead of proving that Si{n) — we will 
prove the stronger Lyapunov condition of the Lyapunov central limit theorem (see e.g. 

[19]) for the random variables Xk- Denote by := E[|X|p]p the Lp{X,J^,F)- 

norm of the random variable X and recall the bounds Kj{Zi), j = 1,2, of the 
elements of /y(6'o; Zi). Given p > 2 and any y G M^, by the conditional version of 
Minkowski's inequality, we have that 

1 



^jen(feA) 3=1 

2 



< 



E Ei%rE.,.[A-,(Zi)'': 



^00 




Iy{Oq\ Z, 



ieQ(kA) j=l 

which in turn implies that 



E, 



90 



y^X,nQ{kA) 



< 



< 



1 



We will now deal with this expression when p = 4. In the case of p = 4 the finiteness 
of Eqq [Kj{Ziy] holds for any compact parameter space 6 (alternatively Sp) since 
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Egj, [log(Zi)^] and Eg^, [Z^] are finite (their expressions can be found in Section [A. 31 
in tlie Appendix). By furtlier also noticing tliat Yl'j=i lUjl'^ < ^ 
recalling that ^ X]fc=i ~ e"^''^'^) — ?■ 1, when choosing p = 4 and letting n — oo, we 
finally obtain that the right hand side of the above expression tends to zero, whereby 

[\'y'^-^k\^] ~^ 0- Hence, the Lyapunov condition X]fc=i^fo |y^^fc|^ — )■ is 
satisfied, and hereby the Lindeberg condition Si{n) — )■ follows. 

The next convergence proved is lim„_j.oo 5*3 (n) = 0. From the above derivations, 
we now additionally see that 



n 2 1 " 

hm Y^Ee,, [\X,\'] < ^Y.^eo [K.iZ,)'] lim -^^(1 -e 



k=l 



^5^E,„ [K,{Z,r] <oc. 



2 



k=l 



Given p{k) = ^(1 - e'^o'^^), recall from Lemma O that E[N{h + t)\N{h) = i] = 
ie-^'^+p and E[N'^{h + t)\N{h) = i] = i{i - l)e-2Mi+(i + 2p)ie-^'* + p. By 
exploiting Markov's inequality and considering bounds given in [9], for each k = 
1, 



, ra, we have that 



1 



F{\R,\>e/2)<—Eg,[\Rk\] 
= -j^Ee,[\iNiOo;Nk,Nk-i) 



< 



-Eo 



2 A fEe,[Nk 



"0 



^/n6 \ Aao 
2A 



1 + 



gpA^ + (3iVfc + iVfc„i)A ' 

X _ g-/ioA 

apA + {3Ee,[Nk] +Ee,[Nk^i]) 
A + ^(3(l-e-'^o^^) + 



2A 



A/io 



+ 1 + ao 



-MO A 



< 



A + ^ 

MO 



^/ne \ Apo 1 - e-^'o^ 

as n ^ oo, whence sup;^<fc<„ P (|-Rfc| > s) 0. It now readily follows that 

n 

lim SsH < lim sup F{\Rk\ > e) Ve,„ [\X,\'] = 0. 



n^oo x<k<n 



k=l 
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We note further that from the above arguments it also follows that Eg^, [|Xfe|] < oo 
and EgJIi^fcl] < oo. 
We now turn to 



k=l 



From Lemma 13.11 we find that 
Ee,[NkNk-i] 



Ee,[Eeo[Nk\Nk-i]Nk-i] 
p{k - l)(p(A; - 1) + 1) e-^'''^^ +p{k)p{k - 1) 



< 2 



"0 / tto 



+ 1 



and since p{k) < ao/fiQ for any k, we note that according to j9] we have that 



n 



6o 



«0 



' , /aoA2 + (3iV, + iV,_i)A^' 



^0 



, 2aoA3(3EeJiVfc]+EeJiVfc_i]) 



(1 _e-M^)^ 



A2(9EeJiV|] + 6Eg,[NkNk-i] + E^JiVt J) 



(1 _e-MoA)5 



11 1 Oq f Oq 



<- —- - + ! 



A^ 



(1 _ e-/^oA)2 



+ 



SttoA^^ + A222sa + 1 

^ Mo Mo \Mo 

(1 -e-^'oA)2 



-G 



n ■ 



2,R ■ 



<oo 



By Markov's inequality we have that 

¥{\X,\>e/2) < ^EeMkl] 



< 
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whereby, as n — oo, it follows that 



SJn) < 



<oo 



Dealing finally with the convergence to zero of the last two sums, by recalling the 
bound Eeg[|i?fc|] < -^Ci^ji, we now see that 

n 

S,{n) = 2^EeJ|i?fc|]E,„[|X,|l{|Xfc| >£/2}] 



k=l 



< 



\ n ^ 



■do 



k=l 



e/2 



■l{\Xk\>e/2} 



\fne 



as — oo. Similarly, as — oo, we also obtain 

n 

= 2Y,^eo[\XkmeA\Rk\H\Rk\>e/2}] 

2 



< 



k=l 

4 tto 



n 

X 5^(1 -e-'^»'=^)Ee„ [\Rk\H{\Rk\ > e/2}] 



k=l 



< 



y/n EflQ ^ 



n] 



0. 



<oo 



Hence, as e > was arbitrary, we conclude that lim,„^oo S{n) = 0, and hereby 
the convergence of expression (1A.8|) follows. 

The a.s. convergence of —Bn{0n) to I{9q): We first note that by writing 



py(^o) 


02x2\ 


/ 02x2 


02x2 \ 


\ 02x2 


02x2 / 


\O2x2 
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we have that 

-Br, (9, 



< 



in% + x{e^-e,))dx + i%) 
1 1 " 

n ^-^ 

k=l 

/•I 1 " 



n[Uq + X 

k=l 

Dyin) + DNin). 



- 9o); Nk, Nk-i)dx + I 



Denote by /„(ao,/io) the ID-process hkehhood and by {an, fin) the remainder 
term used in the Taylor expansion of the corresponding proof of the asymptotic 
normahty in j9]; 

n {{an, fin) - («o, /^o))^ = {-Bn{an, fin)) ^ ^/„(ao, /^o)- 



That D]\f{n) 0, or equivalently that —Bn {an, fin) -—^ In{So)^ as n — )■ oo, then 
follows from j9] (in combination with |12) (Theorem 2)). 

Hence, it now only remains to show that -Dy(n) —4 0. From the dominated 
convergence theorem (recall the continuity and the bounds of Iy{Oo',z)) we obtain 
the 6'-continuity of Egg [Iy{0', -^j)] , whereby the continuity of 7(^) := ^Eg,, [/y (6*; Z^)] 
follows. Hereby, for every e > 0, there exists a p > such that for 9 E Sp = {9 & Q : 
d{9, 9o) < p} we have 

|7(^)-7(^o)| = |7(^) + /y(^o)|<£/2. 

Furthermore, from [9] we have the following strong law of large numbers for N{t): 
As n — )■ oo, for any 7rAr(-; ^o)-integrable function -i? : N — M we obtain 



1 " 

-J2^iN{kA)) ^ J2^{x)Mx;9o). 



k=l 



By combining this with Lemma 13.11 we further obtain that 
ExeN xnN{x; 9o) 



A{n) 



n Y2=l 



Through Slutsky's lemma (see e.g. [I^), when combining 
this convergence with Lemma lA.ll (recall the continuity and the bounds related to 



so 
Mo ' 
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/y(6'; Zi)), we get that 



sup 

eeSp 



n ^-^ n 

1=1 



sup 



Ain) 



Y,W{e;Z,)-E, 



00 



Iy{0; Zi) 



A{n) a. 



0. 



n 



But since ^^Egg [/y (6'; Zj)] f^^fo -^i)] , as n — oo, we have that 



sup 



A(n) 



Hence, there a.s. is some integer n* such that n > n* imphes that 



sup 



A{n) 
i=l 



< e/2. 



Now, by choosing n* large enough to have On G when n > n* we have that 

Ain) 

n 



By in) < 







- V/y(^o + x(^n-^o);^i) + /y( 



^0. 



i=l 



dx 



< I sup 



A{n) 

-J2iy(9-Z,)-^{9) 

i=l 



7(^) + /y(^o) 



dx < e. 



We now finally find the inverse of the Fisher information, J(6'o)~^, which is given 
by the covariance matrix of the asymptotic Gaussian distribution of y/n[9n — Oq) in 
the statement of the theorem. 

□ 



42 



A. 3 Log- likelihood derivatives 

When Z ~ r(2Ao/o-^, (t^A'o/2Ao) we have that 

E,JZ4] = i^o'(Ao + o^o')(2Ao + a2)(2Ao + 3a2)/4A3, 
EgJlogZ] = log(i^o(ToV2Ao)+^/'(2AoK), 

E,Jlog(Z)^] = log(j^)'-21og(j^)^(^)+^(^) 

(t) ' 

where = r'(x)/r(x), r(x) is the gamma function, and 



E,Jlog(Z)^] = log(2)^ + 4 log(2)3 log 

3 



+ log(16) log 



An 



6 log 



log 



An 



( 2Ao y 



(t) 



Aq 

4 

- 4 log 



-61og(2)2log 
/ 2Ao 



(- 



An 



2Ar 



log 



2Ar 



+ log(8) + 3 loe 



2Ar 



0"^ 



4 log(2)V^'' 



Aq 
- 4 log 



2Ao 



or, 



2Ao 



An 



r 



2Af 



2Ar 



2A ^ ^ 



Note that these expressions readily can be obtained by using the software Math- 
ematica. 

Furthermore, by writing Iy{0]z) = log 'jt{z] X, K, a) for simplicity, the partial 
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derivatives of /y(6'; z) w.r.t. A, K and cr are given by 

\og{z) - i + 1 + log(2) - log (^) - ^ (^) 

2A(z-/s:) 





z) 


d\ 








dK 






z) 


da 






z) 


dX^ 






z) 


dXda 




z) 


dKda 







(A.9) 



f - log(z) - log(2) - 1 + log (^) + ^ (g) 



ctV4A 



2 o\ // /2AAA dHYi9;z) 2{z-K) 



a' - 2X4/ 



Xa^ \ V^V/ ' 9XdK a^K^ 

f - log(z) - 2 - log(2) + log (^) + ^ (^) + g^V^- (f^) 

4X{K - z) dHyiQ; z) _ 2X{K - 2z) 

-f + log(^) + ^ - log (^) - (!^) - (!^) 
' aVl2A 
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